
docs(dcgm): add comprehensive package-level documentation and improve godoc #113

Open
cluster2600 wants to merge 1 commit into NVIDIA:main from cluster2600:docs/add-package-doc-and-improve-godoc


What

  • Add pkg/dcgm/doc.go with a comprehensive package-level documentation comment
  • Remove a stray package comment in diag.go (mid-file, not at the package declaration) that godoc ignores but that misleads readers and editor tooling
  • Remove the now-duplicate package comment from hostengine_status.go
  • Expand godoc for WatchFieldsWithGroupEx to document all three parameters and their units
  • Expand godoc for WatchFieldsWithGroup with default parameter values and a complete runnable example

Why

The package currently has no doc.go, and its only package-level comment is a single line that appears in two files (diag.go and hostengine_status.go). The package documentation contains no usage examples, which makes it difficult for new users to:

  • Discover the correct initialisation sequence (Init modes)
  • Understand the relationship between field groups, GPU groups, and watches
  • Know which cleanup functions to call (and when) to avoid resource leaks
  • Find the health-check, policy-violation, and diagnostics APIs

Without this documentation, the entry point for learning the package is the samples/ directory, which is not linked from pkg.go.dev and requires reading multiple source files.

How

New pkg/dcgm/doc.go covers:

  • Package overview and DCGM background
  • All three Init modes (Embedded, Standalone, StartHostengine) with examples
  • Complete field-watch workflow (FieldGroupCreate → WatchFieldsWithGroup → GetValuesSince → cleanup)
  • GPU group management (GroupAllGPUs, CreateGroup, AddToGroup, DestroyGroup)
  • Health check setup and result inspection
  • Policy violation monitoring with context cancellation
  • Diagnostics (RunDiag)
  • Thread-safety guarantees
  • Resource management patterns and the consequences of leaking handles
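The resource-management item above comes down to releasing handles in the reverse order of their creation, which `defer` gives for free. The following is a minimal sketch of that ordering only. It makes no go-dcgm calls: `tracker`, `acquire`, and `runWorkflow` are hypothetical stand-ins for the init, field-group, and watch steps, so it runs without a DCGM installation.

```go
package main

import "fmt"

// tracker records the order in which stand-in resources are created and
// released. It is a hypothetical helper, not part of go-dcgm.
type tracker struct {
	events []string
}

// acquire simulates creating a handle and returns the function that
// releases it, mirroring the "create, then defer cleanup" pattern.
func (t *tracker) acquire(name string) func() {
	t.events = append(t.events, "create "+name)
	return func() {
		t.events = append(t.events, "release "+name)
	}
}

// runWorkflow mirrors the shape init -> field group -> watch/read -> cleanup.
// Deferred calls run last-in, first-out, so the field group is released
// before the engine it depends on.
func runWorkflow(t *tracker) {
	shutdown := t.acquire("engine")
	defer shutdown()

	destroyFieldGroup := t.acquire("field-group")
	defer destroyFieldGroup()

	t.events = append(t.events, "watch+read")
}

func main() {
	tr := &tracker{}
	runWorkflow(tr)
	for _, e := range tr.events {
		fmt.Println(e)
	}
}
```

Deferring each release immediately after a successful acquire keeps the cleanup next to the creation and still runs it on early returns.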

diag.go: Remove the // Package dcgm ... line at line 14 (inside the file body, after the import block). Go's godoc tool ignores package comments that are not immediately above the package keyword, but having them in the middle of a file is confusing and was creating noise in editors.
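The behaviour described for diag.go can be checked with the standard library alone. The sketch below parses two made-up sources with `go/parser` and reports whether a comment was attached to the package clause. The source strings and the `packageDoc` helper are illustrative and are not taken from the repository.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// attached has the comment directly above the package clause.
const attached = `// Package dcgm provides bindings.
package dcgm
`

// stray has the same comment in the file body, after the import block.
const stray = `package dcgm

import "fmt"

// Package dcgm provides bindings.

var _ = fmt.Sprint
`

// packageDoc parses src and returns the text of the comment that the
// parser attached to the package clause, or "" if there is none.
func packageDoc(src string) (string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "example.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	if f.Doc == nil {
		return "", nil
	}
	return f.Doc.Text(), nil
}

func main() {
	for _, src := range []string{attached, stray} {
		doc, err := packageDoc(src)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("package doc: %q\n", doc)
	}
}
```

Only the first source yields a package doc. The mid-file comment is parsed as an ordinary comment and never reaches the rendered documentation.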

hostengine_status.go: Remove the duplicate // Package dcgm ... comment now that doc.go is the canonical package doc location. When several files in a package carry a package comment, go/doc concatenates them, so leaving this one in place would append a redundant line to the rendered package overview.
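How `go/doc` treats package comments in more than one file can be observed directly. The sketch below builds a package from in-memory sources and prints the merged documentation. The file contents, the import path, and the `sourceFile` and `mergedPackageDoc` names are invented for the demonstration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/doc"
	"go/parser"
	"go/token"
)

// sourceFile is a hypothetical in-memory Go file used for the demonstration.
type sourceFile struct {
	Name string
	Src  string
}

// mergedPackageDoc parses the given files as one package and returns the
// package documentation that go/doc computes for it.
func mergedPackageDoc(files []sourceFile) (string, error) {
	fset := token.NewFileSet()
	var parsed []*ast.File
	for _, sf := range files {
		f, err := parser.ParseFile(fset, sf.Name, sf.Src, parser.ParseComments)
		if err != nil {
			return "", err
		}
		parsed = append(parsed, f)
	}
	// The import path is a placeholder; it does not affect the doc text.
	pkg, err := doc.NewFromFiles(fset, parsed, "example.com/dcgm")
	if err != nil {
		return "", err
	}
	return pkg.Doc, nil
}

func main() {
	text, err := mergedPackageDoc([]sourceFile{
		{Name: "diag.go", Src: "// Package dcgm wraps DCGM diagnostics.\npackage dcgm\n"},
		{Name: "hostengine_status.go", Src: "// Package dcgm reports hostengine status.\npackage dcgm\n"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(text)
}
```

Both comments appear in the result, joined in filename order, which is why keeping a single package comment in doc.go gives the cleanest output.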

fields.go:

  • WatchFieldsWithGroupEx: add parameter descriptions with units (microseconds for updateFreq, seconds for maxKeepAge)
  • WatchFieldsWithGroup: add the concrete default values (30 s, unlimited age, 1 sample) and a copy-paste-ready example
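Because the two time parameters use different units, passing a value in the wrong unit is an easy mistake. The sketch below converts `time.Duration` values into the units listed above. The `watchParams` struct and both helpers are hypothetical and not part of go-dcgm. The field types (int64, float64, int32) and the use of 0 to mean "unlimited age" are assumptions to check against fields.go.

```go
package main

import (
	"fmt"
	"time"
)

// watchParams holds the three tuning values in the units the PR documents.
// Hypothetical helper type; field types are assumed, not confirmed.
type watchParams struct {
	UpdateFreq     int64   // microseconds between updates
	MaxKeepAge     float64 // seconds to retain samples; 0 assumed to mean unlimited
	MaxKeepSamples int32   // number of samples to retain
}

// newWatchParams converts Go durations into the documented units.
func newWatchParams(interval, keepAge time.Duration, samples int32) watchParams {
	return watchParams{
		UpdateFreq:     interval.Microseconds(),
		MaxKeepAge:     keepAge.Seconds(),
		MaxKeepSamples: samples,
	}
}

// defaultWatchParams reproduces the defaults stated in the PR description:
// 30 s update interval, unlimited age, 1 sample.
func defaultWatchParams() watchParams {
	return newWatchParams(30*time.Second, 0, 1)
}

func main() {
	fmt.Printf("defaults: %+v\n", defaultWatchParams())
	fmt.Printf("custom:   %+v\n", newWatchParams(500*time.Millisecond, 5*time.Minute, 10))
}
```

Keeping the conversion in one place lets callers work in `time.Duration` and pass the resulting fields on to the watch call.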

Testing

  • go vet ./... passes (requires CGo toolchain; the package comment changes have no runtime effect)
  • go doc ./pkg/dcgm now renders the full package overview
  • Verified no duplicate package comments remain with grep -rn "Package dcgm" pkg/dcgm/

Checklist

  • Documentation only — no logic changes, no API changes
  • All examples in doc.go follow patterns already established in samples/
  • go build compatible (doc.go contains only a package comment and the package dcgm declaration)
  • No new dependencies

Add pkg/dcgm/doc.go with a full package-level documentation comment that
covers the package purpose, all three Init modes, field watching workflow,
GPU group management, health checks, policy violation monitoring, diagnostics,
thread-safety guarantees, and resource management patterns.

The package previously had a one-line comment scattered between two files
(diag.go and hostengine_status.go) with no usage examples.  This made it
difficult for new users to discover the correct initialisation sequence or
understand how the various subsystems relate to each other.

Additional changes:
- Remove the stray '// Package dcgm ...' line from diag.go (mid-file,
  not at the package declaration), which godoc ignores but which
  misleads readers and editor tooling.
- Remove the duplicate package comment from hostengine_status.go now
  that doc.go is the canonical location.
- Expand the godoc for WatchFieldsWithGroupEx to document all three
  parameters (updateFreq, maxKeepAge, maxKeepSamples) and their units.
- Expand the godoc for WatchFieldsWithGroup to document the default
  parameter values and include a complete runnable example.